What’s new?
Ernesto Carrella
January 17, 2016
What’s new?
- Decision-Making
- Model Validation
- Policymaker agent
- Reinforcement Learning
- OSMOSE WFS
Decision-Making

Decision-Making
- Before:
    - Explore-Exploit-Imitate
    - Simple, adaptive, trial and error
- Now:
    - 10 different algorithms
    - Some use no imitation
    - Some use more imitation
    - Some build heatmaps
    - All very adaptive
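The original Explore-Exploit-Imitate rule can be sketched as a single adaptive step per agent. This is a minimal illustration, not the model's actual implementation: the `Fisher` class, attribute names, and the grid neighbourhood are all assumptions made for the sketch.

```python
import random
from dataclasses import dataclass

@dataclass
class Fisher:
    spot: tuple         # (x, y) cell currently fished (illustrative)
    last_profit: float  # profit from the last trip (illustrative)

def random_neighbour(spot):
    """Pick a random adjacent cell to explore (hypothetical geography)."""
    x, y = spot
    return (x + random.choice([-1, 0, 1]), y + random.choice([-1, 0, 1]))

def explore_exploit_imitate(agent, friends, epsilon=0.2):
    """One trial-and-error step: explore with probability epsilon,
    otherwise imitate the best-performing friend, otherwise exploit."""
    if random.random() < epsilon:
        return random_neighbour(agent.spot)                # explore
    best = max(friends, key=lambda f: f.last_profit, default=None)
    if best is not None and best.last_profit > agent.last_profit:
        return best.spot                                   # imitate
    return agent.spot                                      # exploit
```

The newer algorithms vary exactly these ingredients: some drop the imitation branch, some weight it more heavily, and some replace the random exploration with a heatmap of expected profits.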
What for?

Model Validation
- Sensitivity Analysis for all patterns
- Model passes all ANTs
- Failure is hilarious
Quota Gear - pattern

Quota Gear - test
| parameter | value 1 | value 2 |
|---|---|---|
| \(\epsilon\) | 0.2 | 0.05 |
| \(K\) | 5000 | 20000 |
| \(m\) | 0.001 | 0.07 |
| hold size | 100 | 10 |
| cell width | 10 | 20 |
| speed | 5.0 | 15 |
| gas price | 0.01 | 0.85 |
Quota Gear - demo

Policymaker agent

Policymaker agent

Policymaker agent
- Decision rule maps indicators to actions
- Decision rule parameters to optimise
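A minimal sketch of what such a parameterised decision rule could look like. The linear form, the parameter names `a` and `b`, and the quota cap are all illustrative assumptions; the point is only that the rule maps an indicator (biomass) to an action (next quota), and that `a` and `b` are what the optimiser tunes.

```python
def quota_rule(biomass, a, b, max_quota=500_000.0):
    """Hypothetical policymaker rule: set next period's total quota
    as a clamped linear function of observed biomass.
    `a` (intercept) and `b` (slope) are the parameters to optimise."""
    quota = a + b * biomass
    return min(max(quota, 0.0), max_quota)
```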
Example

Reinforcement Learning
- You have indicators and actions
- You do not know a decision rule to map one to the other
- Can you find it by playing the model many times?
- Very much work in progress
- Very finicky method
- Results are opaque
- Learning it on the job
Reinforcement Learning - example
- 300 Fishers
- Can’t set quotas
- Can only open/close the fishery each month
- Biomass and time of year are our only indicators
- Train it for 1000 episodes, \(\gamma = .999\)
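The setup above can be sketched as tabular Q-learning over a discretised (biomass, month) state. Only the two indicators, the open/close action set, and \(\gamma = 0.999\) come from the slide; the bin count, biomass scale, learning rate, and exploration rate are assumptions made for the sketch.

```python
import random
from collections import defaultdict

GAMMA = 0.999          # discount factor from the slide
ACTIONS = [0, 1]       # 0 = close the fishery this month, 1 = open it

def discretise(biomass, month, n_bins=10, max_biomass=1e7):
    """State = (biomass bin, month): the two indicators from the slide.
    `n_bins` and `max_biomass` are illustrative assumptions."""
    bin_ = min(int(biomass / max_biomass * n_bins), n_bins - 1)
    return (bin_, month)

Q = defaultdict(float)  # Q[(state, action)] -> value estimate

def choose_action(state, epsilon=0.1):
    """Epsilon-greedy action selection while training."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)                      # explore
    return max(ACTIONS, key=lambda a: Q[(state, a)])       # greedy

def update(state, action, reward, next_state, alpha=0.1):
    """One-step Q-learning backup."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + GAMMA * best_next
                                   - Q[(state, action)])
```

Each of the 1000 training episodes would run the fisheries model for a year, calling `choose_action` monthly and `update` with the landings-based reward.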
Reinforcement Learning - result

OSMOSE WFS
- Ecosystem model
- Calibrated on fixed mortality
- We want to model the grouper fishers
- We have logbook data and logit fits
Fitting decision parameters
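One way to fit such parameters from the logbook data is a simple one-covariate logit, estimated by gradient ascent on the log-likelihood. This is a hedged sketch, not the actual fitting procedure: the covariate choice (expected profit), the function name, and the optimiser settings are all assumptions.

```python
import math

def fit_logit(xs, ys, lr=0.1, steps=2000):
    """Hypothetical sketch: fit P(go fishing) = 1/(1 + exp(-(b0 + b1*x)))
    to 0/1 trip decisions `ys` given a covariate `xs`
    (e.g. expected profit), by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p            # gradient w.r.t. intercept
            g1 += (y - p) * x      # gradient w.r.t. slope
        b0 += lr * g0 / len(xs)
        b1 += lr * g1 / len(xs)
    return b0, b1
```

The fitted coefficients then become the decision parameters of the simulated grouper fishers.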

What’s next?
- Push for California and WFS
- Reinforcement Learning for Agents
- More policy-making